Optimal control of thermostatically controlled loads connected to a district heating network is considered as a sequential decision-making problem under uncertainty. The practicality of a direct model-based approach is compromised by two challenges, namely scalability, due to the large dimensionality of the problem, and the system identification effort required to obtain an accurate model. To help mitigate these problems, this paper leverages recent developments in reinforcement learning in combination with a market-based multi-agent system to obtain a scalable solution that achieves a significant performance improvement in a practical learning time. The control approach is applied to a scenario comprising 100 thermostatically controlled loads connected to a radial district heating network supplied by a central combined heat and power plant. For both an energy arbitrage and a peak shaving objective, the control approach requires 60 days to obtain a performance within 65% of a theoretical lower bound on the cost.